Freeze-Thaw Bayesian Optimization
Abstract
In machine learning, the term “training” describes the procedure of fitting a model to data. In many popular models, this fitting procedure is framed as an optimization problem in which a loss is minimized as a function of the parameters. In all but the simplest machine learning models, this minimization must be performed with an iterative algorithm such as stochastic gradient descent or the nonlinear conjugate gradient method.
Another aspect of training involves fitting model “hyperparameters.” These are parameters that govern either the model space or the fitting procedure; in both cases they are typically difficult to optimize directly in terms of the training loss and are usually evaluated in terms of generalization performance via held-out data. Hyperparameters of the model are often regularization penalties such as ℓ_p norms on model parameters, but they can also capture model capacity, as in the number of hidden units in a neural network. These hyperparameters help determine the appropriate bias-variance tradeoff for a given model family and data set. Hyperparameters of the fitting procedure, on the other hand, govern algorithmic aspects of training, such as the learning rate schedule of stochastic gradient descent or the width of a Monte Carlo proposal distribution. The goal of fitting both kinds of hyperparameters is to identify a model and an optimization procedure for which successful minimization of the training loss is likely to result in good generalization performance.
When a held-out validation set is used to evaluate the quality of hyperparameters, the overall optimization proceeds as a double loop, where the outer loop sets the hyperparameters and the inner loop applies an iterative training procedure to fit the model to data. Often this outer hyperparameter optimization is performed by hand, which, even if rigorously systematized, can be a difficult and laborious process.
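The double loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's method: the toy quadratic loss, the function names `train`, `validation_loss`, and `random_search`, and the log-uniform learning-rate range are all hypothetical choices made for the example, and the outer loop uses plain random search.

```python
import random

def train(learning_rate, n_steps=50):
    """Inner loop: gradient descent on a toy loss (w - 3)^2 (an assumption)."""
    w = 0.0
    for _ in range(n_steps):
        grad = 2.0 * (w - 3.0)        # derivative of (w - 3)^2
        w -= learning_rate * grad
    return w

def validation_loss(w):
    """Stand-in for held-out loss; here it reuses the toy objective."""
    return (w - 3.0) ** 2

def random_search(n_trials=20, seed=0):
    """Outer loop: sample a hyperparameter, train fully, keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, 0)  # log-uniform learning rate in [1e-4, 1]
        w = train(lr)                  # every candidate is trained to completion
        loss = validation_loss(w)
        if best is None or loss < best[1]:
            best = (lr, loss)
    return best

best_lr, best_loss = random_search()
```

Note that every candidate pays the full cost of the inner loop before its hyperparameter can be compared with the others.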
Simple alternatives include heuristics and intuition; grid search, which scales poorly with dimension; and random search [1], which is computationally expensive because many models must be trained. In light of this, Bayesian optimization [2] has recently been proposed as an effective method for systematically and intelligently setting the hyperparameters of machine learning models [3, 4]. Using a principled characterization of model uncertainty, Bayesian optimization attempts to find the best hyperparameter settings with as few model evaluations as possible.
One issue with previously proposed approaches to Bayesian optimization for machine learning is that a model must be fully trained before the quality of its hyperparameters can be assessed. Human experts, however, appear to be able to rapidly assess whether a model is likely to eventually be useful, even when the inner-loop training is only partially completed. When such an assessment can be made accurately, it is possible to explore the hyperparameter space more effectively by aborting model fits that are likely to be of low quality.
The goal of this paper is to take advantage of the partial information provided by iterative training procedures within the Bayesian optimization framework for hyperparameter search. We propose a new technique that makes it possible to estimate when to pause the training of one model in favor of starting a new one with different hyperparameters or resuming the partially completed training of an old model. We refer to our approach as freeze-thaw Bayesian optimization, as the algorithm maintains a set of “frozen” models (partially trained but not currently being trained) and uses an information-theoretic criterion to determine which ones to “thaw” and continue training.
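The freeze-thaw idea of keeping partially trained models and choosing which one to resume can be illustrated with a rough sketch. The code below is not the paper's algorithm: it replaces the probabilistic model and the information-theoretic criterion with a naive heuristic that assumes the most recent improvement ratio repeats. The class `FrozenModel`, the helper `predicted_next_loss`, the fixed `new_every` schedule for starting new models, and the toy quadratic loss are all assumptions made for illustration.

```python
import random

class FrozenModel:
    """A partially trained toy model that can be resumed ("thawed")."""
    def __init__(self, lr):
        self.lr = lr
        self.w = 0.0
        self.losses = []               # toy validation loss after each epoch

    def train_one_epoch(self):
        self.w -= self.lr * 2.0 * (self.w - 3.0)   # one gradient step
        self.losses.append((self.w - 3.0) ** 2)

def predicted_next_loss(model):
    """Naive extrapolation (an assumption): the last improvement ratio repeats."""
    if len(model.losses) < 2 or model.losses[-2] <= 0.0:
        return model.losses[-1]
    ratio = min(model.losses[-1] / model.losses[-2], 1.0)
    return model.losses[-1] * ratio

def freeze_thaw_search(budget=60, new_every=5, seed=0):
    """Spend a fixed epoch budget across a basket of frozen models."""
    rng = random.Random(seed)
    basket = []
    for step in range(budget):
        if step % new_every == 0:
            # Start a new model with a freshly sampled learning rate.
            model = FrozenModel(lr=10 ** rng.uniform(-4, -0.5))
            basket.append(model)
        else:
            # Thaw the frozen model whose extrapolated loss is lowest.
            model = min(basket, key=predicted_next_loss)
        model.train_one_epoch()
    best = min(basket, key=lambda m: m.losses[-1])
    return best, basket

best, basket = freeze_thaw_search()
```

In this sketch unpromising models are simply never thawed again, so most of the epoch budget flows to the candidates whose learning curves look best so far.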
Journal: CoRR
Volume: abs/1406.3896
Publication year: 2014